Heterogeneous acceleration of volumetric JPEG 2000 using OpenCL
Abstract
This paper discusses an OpenCL version of a volumetric JPEG 2000 codec that runs on GPUs, multi-core processors, or a combination of both. The performance-critical part consists of a fine-grained algorithm (the discrete wavelet transform) and a coarse-grained algorithm (Tier-1), so the best performance is obtained with hybrid execution, in which the discrete wavelet transform runs on a GPU and Tier-1 runs on a multi-core processor. Combining an Intel i7 multi-core processor with a modest NVIDIA Quadro K620 GPU yields speedups greater than 10 compared with the original sequential code. The paper discusses the performance bottlenecks that arise on GPUs when parallelizing algorithms that are coarse-grained by nature, along with the optimizations that are possible. A performance analysis reveals the inefficiencies and explains the deviations from GPU peak performance.
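To illustrate why the discrete wavelet transform counts as fine-grained, here is a minimal sketch of one level of a 1-D lifting transform. The abstract does not say which wavelet filter the codec uses; this sketch assumes the reversible 5/3 filter from JPEG 2000 Part 1, and the function names `dwt53_forward` and `dwt53_inverse` are hypothetical. It is plain sequential Python rather than OpenCL, meant only to show that each output sample depends on a few neighbours.

```python
def dwt53_forward(x):
    """One level of the reversible 5/3 lifting transform (assumed filter).

    x is an even-length list of ints; returns (low, high) subbands.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expected an even number of samples (at least 2)")
    half = n // 2

    # Predict step: each high-pass sample needs only its two even neighbours,
    # so the iterations are independent of each other. This is the kind of
    # work that could map to one GPU work-item per sample.
    high = []
    for i in range(half):
        left = x[2 * i]
        # Symmetric extension at the right border.
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        high.append(x[2 * i + 1] - ((left + right) >> 1))

    # Update step: each low-pass sample needs two neighbouring high-pass values.
    low = []
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]  # symmetric extension, left border
        low.append(x[2 * i] + ((prev + high[i] + 2) >> 2))
    return low, high


def dwt53_inverse(low, high):
    """Undo dwt53_forward by running the lifting steps in reverse order."""
    half = len(low)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]
        x[2 * i] = low[i] - ((prev + high[i] + 2) >> 2)
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = high[i] + ((left + right) >> 1)
    return x


if __name__ == "__main__":
    signal = [10, 12, 14, 13, 9, 7, 8, 20]
    low, high = dwt53_forward(signal)
    print("low :", low)
    print("high:", high)
    print("round trip ok:", dwt53_inverse(low, high) == signal)
```

A volumetric codec would apply such a pass along each of the three axes. Tier-1, by contrast, codes each code-block as one largely sequential task, which is why the abstract describes it as coarse-grained.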
Similar resources
Evaluation of ‘OpenCL for FPGA’ for Data Acquisition and Acceleration in High Energy Physics
The increase in the data acquisition and processing needs of High Energy Physics experiments has made it more essential to use FPGAs to meet those needs. However, harnessing the capabilities of the FPGAs has been hard for anyone but expert FPGA developers. The arrival of OpenCL, with the two major FPGA vendors supporting it, offers an easy software-based approach to taking advantage of FPGAs in a...
The Support of an Experimental OpenCL Compiler on HSA Environments
In recent years, with the increasing computing power and programmability on GPU, GPU has become an important role on hardware accelerator. Heterogeneous System Architecture (HSA) announced by HSA Foundation is an approach to benefit both CPUs and GPUs advantages. Open Computing Language (OpenCL) is one of the well-known programming frameworks for parallel computing on heterogeneous architecture....
Energy-efficient FPGA Implementation of the k-Nearest Neighbors Algorithm Using OpenCL
Modern SoCs are getting increasingly heterogeneous with a combination of multi-core architectures and hardware accelerators to speed up the execution of compute-intensive tasks at considerably lower power consumption. Modern FPGAs, due to their reasonable execution speed and comparatively lower power consumption, are strong competitors to the traditional GPU based accelerators. High-level Synthe...
Wavelet based volumetric medical image compression
The amount of image data generated each day in health care is ever increasing, especially in combination with the improved scanning resolutions and the importance of volumetric image data sets. Handling these images raises the requirement for efficient compression, archival and transmission techniques. Currently, JPEG 2000's core coding system, defined in Part 1, is the default choice for medic...
Lossless Volumetric Medical Image Compression with Progressive Multi-planar Reformatting Using 3-d Dpcm
In this paper, we propose a novel lossless volumetric medical image compression scheme using three-dimensional differential pulse code modulation (3-D DPCM), which provides an efficient procedure to achieve progressive multi-planar reformatting (MPR) of large 3-D medical data sets. Being separable and commutative in the order of its application, 3-D DPCM provides an opportunity to generate MPR ...
Journal:
- IJHPCA
Volume 31
Publication date: 2017